This paper introduces a new similarity measure, the covering similarity, which we formally define to evaluate the similarity between a symbolic sequence and a set of symbolic sequences. A pairwise similarity for comparing two symbolic sequences can be derived directly from the covering similarity. We propose an efficient implementation for computing the covering similarity that uses a suffix tree data structure; other implementations, based on a suffix array for instance, are possible and may be necessary for handling large-scale problems. We have used this similarity to isolate attack sequences from normal sequences in the context of host-based intrusion detection. The experiment we carried out on two well-known benchmarks in the field, considered alongside the results reported for state-of-the-art methods, demonstrates the efficiency and usefulness of the proposed approach.